Keyword [DeepLabv3] [Dilated Conv] [ASPP]
Chen L, Papandreou G, Schroff F, et al. Rethinking Atrous Convolution for Semantic Image Segmentation[J]. arXiv: Computer Vision and Pattern Recognition, 2017.
1. Overview
This paper proposes DeepLabv3 for semantic image segmentation.
- Design modules that employ Atrous Conv in cascade or in parallel, capturing multi-scale context by adopting multiple atrous rates (multi-grid).
- Augment the Atrous Spatial Pyramid Pooling (ASPP) module with image-level features.
- Remove the CRF post-processing step.
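As a reminder of what the atrous rate does, here is a minimal 1-D sketch of atrous convolution, $y[i] = \sum_k x[i + r \cdot k] \, w[k]$, written in plain Python. The function name `atrous_conv1d` and the toy inputs are illustrative assumptions, not code from the paper; only positions where the whole dilated filter fits are computed (no padding).

```python
def atrous_conv1d(x, w, rate):
    """1-D atrous convolution without padding: y[i] = sum_k x[i + rate*k] * w[k]."""
    # Distance between the first and last filter tap once the filter is dilated.
    span = rate * (len(w) - 1)
    return [
        sum(x[i + rate * k] * w[k] for k in range(len(w)))
        for i in range(len(x) - span)
    ]


x = [1, 2, 3, 4, 5, 6, 7]  # toy input signal (assumption)
w = [1, 0, -1]             # toy 3-tap filter (assumption)

print(atrous_conv1d(x, w, rate=1))  # [-2, -2, -2, -2, -2]
print(atrous_conv1d(x, w, rate=2))  # [-4, -4, -4]
```

With the same 3 weights, a larger rate makes the filter cover a wider span of the input (here 5 samples instead of 3), which is how different rates see context at different scales.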
1.1. Multi-scale Methods
2. Details
2.1. Cascaded Modules
2.2. Parallel Modules
2.3. Multi-grid Method
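1) Apply different atrous rates to the 3 Convs within each block from $block4$ to $block7$.
2) The final rate of each Conv is the block's unit rate times the corresponding multi-grid value, e.g. if $MultiGrid=(1,2,4)$ and the unit rate is $2$, then $rates=2 \cdot (1,2,4)=(2,4,8)$.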
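The multi-grid arithmetic can be sketched in a few lines of Python. The helper name `multigrid_rates` and its parameter names are hypothetical, chosen only for this illustration; the sketch assumes the per-Conv rate is simply the block's unit rate multiplied by each multi-grid value.

```python
def multigrid_rates(multi_grid, unit_rate):
    """Return the atrous rate of each Conv in a block.

    multi_grid: per-Conv multipliers, e.g. (1, 2, 4)
    unit_rate:  base atrous rate of the block, e.g. 2
    """
    return tuple(unit_rate * g for g in multi_grid)


print(multigrid_rates((1, 2, 4), 2))  # (2, 4, 8)
print(multigrid_rates((1, 1, 1), 2))  # (2, 2, 2), i.e. no multi-grid effect
```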
2.4. ASPP
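1) Contains: 3 Dilated Convs ($3 \times 3$), 1 $1 \times 1$ Conv and Global AVGPool (image-level features).
2) When the rate is too large (close to the feature map size), a $3 \times 3$ Dilated Conv degrades to a $1 \times 1$ Conv, because only the center weight still falls on valid (non-padded) input.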
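The degradation at large rates can be checked by counting how many of the 9 filter weights land on valid (non-padded) positions. The sketch below is an illustration under stated assumptions: the function name `mean_valid_taps` is hypothetical, the feature map is square, and out-of-range positions are treated as zero padding.

```python
def mean_valid_taps(size, rate):
    """Average number of 3x3 filter weights applied to valid (non-padded)
    positions of a size x size feature map, for a given atrous rate."""
    total = 0
    for i in range(size):
        for j in range(size):
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    # Input position sampled by filter tap (di, dj).
                    y, x = i + di * rate, j + dj * rate
                    if 0 <= y < size and 0 <= x < size:
                        total += 1
    return total / (size * size)


size = 65  # example feature map size (assumption)
for rate in (1, 6, 12, 18, 32, 65):
    print(rate, mean_valid_taps(size, rate))
```

At `rate=1` nearly all 9 weights are used on average, and the count falls as the rate grows. Once the rate reaches the feature map size (`rate=65` here), every off-center tap falls in the padding, so the average is exactly 1.0 and the filter behaves like a $1 \times 1$ Conv.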